<?xml version="1.0" encoding="utf-8"?><!DOCTYPE article  PUBLIC '-//OASIS//DTD DocBook XML V4.4//EN'  'http://www.docbook.org/xml/4.4/docbookx.dtd'><article><articleinfo><title>Information Extraction from Polish free text</title><revhistory><revision><revnumber>8</revnumber><date>2019-10-08 15:36:06</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision><revision><revnumber>7</revnumber><date>2011-11-21 01:02:23</date><authorinitials>MichalLenart</authorinitials></revision><revision><revnumber>6</revnumber><date>2011-03-27 13:11:34</date><authorinitials>AdamPrzepiorkowski</authorinitials></revision><revision><revnumber>5</revnumber><date>2011-03-27 09:02:14</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision><revision><revnumber>4</revnumber><date>2011-03-27 09:01:49</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision><revision><revnumber>3</revnumber><date>2011-03-25 15:45:50</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision><revision><revnumber>2</revnumber><date>2011-03-25 15:19:06</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision><revision><revnumber>1</revnumber><date>2011-03-25 15:19:00</date><authorinitials>MaciejOgrodniczuk</authorinitials></revision></revhistory></articleinfo><section><title>Information Extraction from Polish free text</title><section><title>Project factsheet</title><informaltable><tgroup cols="2"><colspec colname="col_0"/><colspec colname="col_1"/><tbody><row rowsep="1"><entry colsep="1" rowsep="1"><para> Polish name:          </para></entry><entry colsep="1" rowsep="1"><para> Opracowanie narzędzi do ekstrakcji informacji z tekstów w języku polskim </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para> Project type:         </para></entry><entry colsep="1" rowsep="1"><para> A national <ulink url="http://www.eng.nauka.gov.pl/meinen/">Ministry of Science and Higher Education</ulink> research grant (number 3T11C00727) </para></entry></row><row 
rowsep="1"><entry colsep="1" rowsep="1"><para> Duration:             </para></entry><entry colsep="1" rowsep="1"><para> 20 October 2004 ‒ 19 October 2007 </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para> Principal investigator: </para></entry><entry colsep="1" rowsep="1"><para> Agnieszka Mykowiecka </para></entry></row><row rowsep="1"><entry colsep="1" rowsep="1"><para> Institution:          </para></entry><entry colsep="1" rowsep="1"><para> Linguistic Engineering Group, Institute of Computer Science, Polish Academy of Sciences </para></entry></row></tbody></tgroup></informaltable></section><section><title>Project description</title><para>Motivations: </para><itemizedlist><listitem><para>little work on information extraction (IE) from Polish texts, in contrast to the many existing applications for other languages, </para></listitem><listitem><para>existing IE tools could not be applied directly to Polish. </para></listitem></itemizedlist><para>Goals: </para><itemizedlist><listitem><para>adapting selected IE tools to process Polish, </para></listitem><listitem><para>collecting linguistic resources for IE. </para></listitem></itemizedlist><para>Activities: </para><itemizedlist><listitem><para>adapting the IE platforms SProUT and (more recently) GATE to tokenization and morphological analysis of Polish texts, </para></listitem><listitem><para>collecting resources and IE grammars for named entity recognition (NER) in Polish texts, </para></listitem><listitem><para>rule-based IE experiments in a selected domain (medical texts), </para></listitem><listitem><para>testing terminology extraction methods on Polish data. </para></listitem></itemizedlist></section></section></article>